feat: multi SCP composition #8917
Conversation
Just a general question/comment that came to mind: do we want to have any limit on the number of SCPs that can be associated with a cluster?
pebrc
left a comment
I did a first pass, just looking at the code. I have not tested it yet. Will try to find some more time later today.
buildkite test this -f p=gke,t=TestStackConfigPolicy* |
buildkite test this -f p=gke,t=TestStackConfigPolicy* |
pebrc
left a comment
LGTM, nice work!
I think we have two follow-up items:
- improve the error/change attribution
The problem @barkbay mentioned earlier is worse for errors: they are displayed in the status resource for any contributing policy, and we currently leave it up to the user to trace back which source they came from. Can you maybe raise an issue for that?
```
NAMESPACE        NAME                        READY   PHASE            AGE   WEIGHT
elastic-system   elasticsearch-only-policy   1/2     ApplyingChanges  12m   0
elastic-system   kibana-only-policy          1/2     ApplyingChanges  12m   9
```
- documentation (needs to go into the docs-content repo)
@pkoutsovasilis I wanted to follow up on my earlier comment: should we document what limits we might hit, and maybe do some sort of scale testing to see whether it becomes problematic at a certain number? wdyt?
+1 to that. I tested only with a trivial number of SCPs (< 10). Maybe we should see if our reconciliation algorithm runs into problems at higher cardinalities. This could inform the documentation we write as a follow-up and would allow us to give some guidance to users. Just looking at the max annotation size for the soft owners, and assuming no other annotations exist, we could probably support ~2000 owners (63-char namespace + 63-char name ≈ 130 bytes per owner, plus 32 bytes for the key). While this is unlikely to be the limiting factor, I don't think we should advertise it as a limit. My gut feeling is that the total number of SCPs should be < 100 and those targeting a single ES cluster < 10. But it would be good to have more than gut-feel numbers 😄
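The back-of-envelope estimate above can be reproduced. The numbers below are my own illustration, not from the PR: they assume Kubernetes' 256 KiB cap on the total size of all annotations on an object, worst-case 63-character namespaces and names, and a compact JSON map encoding for the annotation value.

```python
# Back-of-envelope check of the ~2000-owner estimate above (assumptions, not
# from the PR): Kubernetes caps the total size of all annotations at 256 KiB.
annotation_budget = 256 * 1024                       # 262144 bytes in total
key_overhead = len("eck.k8s.elastic.co/owner-refs")  # annotation key, paid once

ns, name = "n" * 63, "m" * 63                # longest legal DNS labels
per_owner = len(f'"{ns}/{name}":1,')         # one JSON map entry, ~132 bytes

max_owners = (annotation_budget - key_overhead) // per_owner
print(max_owners)  # on the order of 2000, consistent with the estimate above
```

This lands close to the ~2000 figure quoted in the comment, and supports the point that the annotation size is unlikely to be the first limit hit in practice.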
Thanks for the follow-up @kvalliyurnatt! To be honest, as @pebrc already captured in his comment above, I'm not particularly worried about the owner annotations: those numbers are way beyond what we would ever advertise. Thinking about it, the limit that's more realistic to reach, I believe, is the stack config secret size. We could hit that even when merging just 2 StackConfigPolicies that are quite lengthy, which could push us past the 1MB size limit of a Secret. Everything else, apart from the stack config secret size and the owners annotation, stays the same, so this PR doesn't make any other limit easier to exceed. To accommodate for these, my proposal is to limit the maximum number of SCPs that can target a given stack component to be
I would not put a limit in code. What I was suggesting was to do a bit more "scale" testing. The overall size limit of the settings file is well known, and this PR does not change anything in that regard. We are pretty confident that the annotation size is not a limitation in practice. What we do not know is how the controller behaves when very many SCPs need to be reconciled on each and every change: how many conflicts will we see (if any)? Are there other issues we don't know of? So all I am arguing for is to test the limits of this implementation a bit before we give it to our users.
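A starting point for the scale testing suggested above could be a small generator that emits many StackConfigPolicy manifests targeting the same test cluster. This is a hypothetical helper: the `weight` field name, the label selector, and the cluster setting used here are assumptions for illustration, not taken from the PR.

```python
# Hypothetical scale-test helper: emit N StackConfigPolicy manifests, each with
# a distinct weight, all selecting the same test cluster. Field names (weight,
# resourceSelector, clusterSettings) are assumed for illustration.
def scp_manifest(i: int) -> str:
    return f"""\
apiVersion: stackconfigpolicy.k8s.elastic.co/v1alpha1
kind: StackConfigPolicy
metadata:
  name: scale-test-{i}
  namespace: elastic-system
spec:
  weight: {i}
  resourceSelector:
    matchLabels:
      env: scale-test
  elasticsearch:
    clusterSettings:
      indices.recovery.max_bytes_per_sec: {40 + i}mb
"""

# One YAML document per policy; pipe the result to `kubectl apply -f -`.
manifests = "---\n".join(scp_manifest(i) for i in range(100))
print(manifests.count("kind: StackConfigPolicy"))  # 100
```

Sweeping N (10, 100, 500, ...) while watching reconcile latency and conflict conditions in the policy statuses would give the "more than gut feel" numbers discussed above.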
Overview
This PR implements support for multiple StackConfigPolicies (SCPs) targeting the same Elasticsearch cluster or Kibana instance, using a weight-based priority system for deterministic policy composition.
Key Features
Weight-Based Priority System
Each policy specifies a weight (default 0) that determines a deterministic merge order for policies targeting the same resource.
Conflict Detection
Conflicts are detected across multiple dimensions and will prevent policy application:
- `SecretMount` with same `SecretName`
- `SecretMount` with same `MountPath`

Important: Even if two policies with the same weight have non-overlapping resources, they still conflict because the weight collision makes the merge order ambiguous.
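The weight and conflict rules above can be sketched as a toy model. This is not the ECK implementation, just an illustration of the semantics: a weight collision is rejected outright, and otherwise policies are applied in a deterministic weight order (here, lower weight wins, matching the rule this PR states for `SecureSettings`).

```python
# Toy model (not ECK's code) of the weight rules: policies merge in a
# deterministic weight order, and a weight collision is a conflict even when
# the settings themselves do not overlap.
def compose(policies):
    """policies: list of (name, weight, settings) tuples."""
    weights = [w for _, w, _ in policies]
    if len(set(weights)) != len(weights):
        raise ValueError("weight collision: merge order would be ambiguous")
    merged = {}
    # Apply higher weights first so lower weights overwrite them on overlap,
    # matching the "lower weight wins" rule quoted below for SecureSettings.
    for _, _, settings in sorted(policies, key=lambda p: p[1], reverse=True):
        merged.update(settings)
    return merged

print(compose([
    ("kibana-only-policy", 9, {"a": "from-kibana", "b": "from-kibana"}),
    ("elasticsearch-only-policy", 0, {"b": "from-es"}),
]))  # {'a': 'from-kibana', 'b': 'from-es'}
```

Two policies with equal weights raise an error even with disjoint settings, mirroring the "Important" note above.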
Configuration Merging Behaviour
Different merge strategies are applied based on the configuration type:
Deep Merge (recursive merging):
- `ClusterSettings`
- `Config`
- `SnapshotLifecyclePolicies`
- `SecurityRoleMappings`
- `IndexLifecyclePolicies`
- `IngestPipelines`
- `IndexTemplates.ComposableIndexTemplates`
- `IndexTemplates.ComponentTemplates`

Top-Level Key Replacement (entire keys replaced):
- `SnapshotRepositories`: each repository configuration is treated atomically

Union Merge (with conflict detection):
- `SecretMounts`: conflicts on duplicate `SecretName` OR duplicate `MountPath`
- `SecureSettings`: merges by `SecretName` + `Key`, lower weight wins (no conflicts)

Multi-Soft-Owner Secret Management
File Settings and Policy Config Secrets:
- `eck.k8s.elastic.co/owner-refs` annotation with JSON-encoded map of owner namespaced names

Secret Sources:
This prevents secret leakage while enabling proper cleanup when policies are deleted.
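The three merge strategies listed above can be sketched on plain dicts and lists. This is a sketch of the described semantics, not ECK source; `winner` stands for the policy whose values take precedence under the weight rules.

```python
# Sketch (assumed semantics, not ECK source) of the three merge strategies.

def deep_merge(winner: dict, loser: dict) -> dict:
    """Recursive merge, as described for ClusterSettings, IngestPipelines, etc."""
    out = dict(loser)
    for k, v in winner.items():
        if isinstance(v, dict) and isinstance(out.get(k), dict):
            out[k] = deep_merge(v, out[k])   # recurse into nested objects
        else:
            out[k] = v                       # winner's value replaces the leaf
    return out

def replace_top_level(winner: dict, loser: dict) -> dict:
    """Whole top-level keys are atomic, as described for SnapshotRepositories."""
    return {**loser, **winner}

def union_secret_mounts(a: list, b: list) -> list:
    """Union with conflict detection on duplicate secretName OR mountPath."""
    for x in a:
        for y in b:
            if x["secretName"] == y["secretName"] or x["mountPath"] == y["mountPath"]:
                raise ValueError(f"conflicting SecretMount: {x} vs {y}")
    return a + b

low = {"persistent": {"indices.recovery.max_bytes_per_sec": "40mb"}}
high = {"persistent": {"cluster.routing.allocation.enable": "all"}}
print(deep_merge(low, high))  # both nested keys survive under "persistent"
```

Note the contrast: `deep_merge` keeps non-conflicting nested keys from both sides, while `replace_top_level` discards the loser's value for any top-level key the winner also defines.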
Related Issues